Decision Tree-Based Context Dependent Sublexical Units for Continuous Speech Recognition of Basque
Abstract
This paper presents a new methodology, based on classical decision trees (DT), for obtaining a suitable set of context-dependent sublexical units for Basque Continuous Speech Recognition (CSR). The original method proposed by Bahl [1] was applied as the benchmark. Two new features were then added: a data massaging step to emphasise the data, and a fast and efficient Growing and Pruning algorithm for DT construction. In addition, the use of the new context-dependent units to build word models was addressed. The benchmark Bahl approach gave recognition rates that clearly outperformed those of context-independent phone-like units. Finally, the new methodology improved on the benchmark DT approach.
Related resources
Selection of sublexical units for continuous speech recognition of Basque
This paper describes the work carried out to select the most suitable set of Sublexical Units for Continuous Speech Recognition of Basque. Although Basque has several dialects, only one of them was used to choose the preliminary set of sounds. With this aim in mind, extensive experimentation was carried out to select Context Independent Phone-Like Units. Then, in order to obta...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Decision trees for inter-word context dependencies in Spanish continuous speech recognition tasks
Context Dependent Units are broadly used in Continuous Speech Recognition (CSR) systems, and decision trees are a suitable clustering technique for obtaining this kind of unit. This work aimed to extend decision tree based clustering to model inter-word context dependencies in Spanish CSR tasks. We first used a set of previously defined context dependent units to model word boundaries. A deci...
Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems
HMM-based large vocabulary speech recognition systems usually have a very large number of statistical parameters. For better estimation, the number of parameters is reduced by sharing them across models. The parameter sharing is decided by regression trees which are built using phonetic classes designed either by a human expert or by data-driven methods. In situations where neither of these are...
Acoustic modeling and language modeling for Cantonese LVCSR
This paper describes our recent work on the development of a large-vocabulary, speaker-independent continuous speech recognition system for Cantonese (a major Chinese dialect). Both acoustic modeling and language modeling are being addressed. For acoustic modeling, we focus on right-context-dependent sub-syllable units. Tying of HMM at model as well as state level is applied based on phonetic k...